ABSTRACT

This  paper  studies  reputation  effects  in  games  with  a  single  long-run 
player  whose  choice  of  stage-game  strategy  is  imperfectly  observed  by  his 
opponents.   We  obtain  lower  and  upper  bounds  on  the  long-run  player's  payoff 
in any Nash equilibrium of the game. If the long-run player's stage-game
strategy  is  statistically  identified  by  the  observed  outcomes,  then  for 
generic  payoffs  the  upper  and  lower  bounds  both  converge,  as  the  discount 
factor  tends  to  1,  to  the  long-run  player's  Stackelberg  payoff,  which  is  the 
most  he  could  obtain  by  publicly  committing  himself  to  any  strategy. 

Keywords:   Reputation  effects,  commitment,  Stackelberg,  imperfect 
observability 

JEL Classification: C11, C70, C72, D74

Proposed  Running  Head:   Reputation  and  Imperfect  Observability. 


1.   Introduction 

We consider "reputation effects" in a game in which a single long-run
player  faces  a  sequence  of  short-run  opponents,  each  of  whom  plays  only  once 
but  is  informed  of  the  outcomes  of  play  in  previous  periods.   A  number  of 
papers  have  studied  such  models  since  the  idea  was  introduced  by  Kreps- 
Wilson  [1982]  and  Milgrom-Roberts  [1982]  in  their  studies  of  the  chain- 
store  game.   Put  briefly,  the  idea  is  that  if  a  player  continually  plays  the 
same  action,  his  opponents  will  come  to  expect  him  to  play  that  action  in 
the  future.   Moreover,  if  the  opponents  are  myopic,  as  for  example  the 
short-run  entrants  in  the  chain-store  game,  then  once  they  become  convinced 
that  the  long-run  player  is  playing  a  fixed  stage-game  strategy  they  will 
play  a  best  response  to  that  strategy  in  subsequent  periods.   Foreseeing 
this  response,  the  long-run  player  may  choose  to  "invest  in  his  reputation" 
by  playing  the  strategy  even  when  doing  so  incurs  short-run  costs,  provided 
the  costs  are  outweighed  by  the  long-run  benefit  of  influencing  his 
opponents'  play. 

Intuitively,  one  might  expect  that  the  benefits  of  investing  in 
reputation  will  outweigh  the  costs  when  the  long-run  player  is  sufficiently 
patient.   This  intuition  is  most  clear  if  the  long-run  player's  stage-game 
strategy  is  perfectly  observed  by  his  opponents,  for  then  the  long-run 
player's  investments  have  a  direct  and  predictable  effect  on  the  evolution 
of  his  reputation.   In  many  situations  of  interest,  though,  the  long-run 
player's  stage-game  strategy  is  imperfectly  observed.   This  is  the  case  if 
the  long-run  player  uses  a  mixed  stage-game  strategy,  or  if  he  is  subject  to 
moral hazard, or if his stage-game strategy prescribes actions for contingencies
that need not arise when the stage game is played. In all of these
cases the short-run players must try to infer the long-run player's past
play from the observed information, so that the link between the long-run
player's  choices  and  the  state  of  his  reputation  is  weakened.   However, 
since  even  imperfect  observations  do  provide  some  information,  one  might 
expect  that  reputation  effects  still  have  some  force  in  these  cases.   Our 
goal  in  this  paper  is  to  explore  the  implications  of  reputation  effects  with 
imperfectly observed strategies, and to determine when and how the intuition
for the observed-strategy case needs to be modified.

To  model  reputation  effects,  one  assumes  that  the  short-run  players  are 
not completely certain about the long-run player's payoff function. Following
Harsanyi [1967], this uncertainty is represented by a prior probability
distribution over "types" of the long-run player. In particular, it is
assumed  that  the  short-run  players  assign  positive  prior  probability  to  the 
long-run  player  being  a  "commitment  type"  who  will  play  the  same  stage-game 
strategy  in  every  period.   While  the  early  papers  on  reputation  effects 
solved  for  the  set  of  sequential  equilibria  for  a  given  prior  distribution, 
our approach here and in our [1989] paper is to find bounds on the long-run
player's  payoff  that  hold  uniformly  over  a  range  of  prior  distributions  and 
over  all  of  the  Nash  equilibria  of  the  game. 

In  that  paper,  we  obtained  a  lower  bound  on  the  long-run  player's 
payoff  that  depended  only  on  which  "commitment  types"  have  positive  prior 
probability,  and  not  on  the  relative  probabilities  of  these  types  or  on 
which other types were in the support of the prior distribution. For simultaneous-move
stage games without moral hazard, that lower bound converges
(as the discount factor goes to 1) to the highest payoff the long-run player can
obtain by publicly committing himself to any strategy for which the corresponding
commitment type has positive prior probability. (If the short-run
players are indifferent between several best responses, they are not
required to choose the response that the long-run player likes the best.)
In  particular,  if  every  pure-strategy  commitment  type  has  positive  prior 
probability,  the  lower  bound  converges  to  the  highest  payoff  the  long-run 
player  can  get  by  committing  himself  to  a  pure  strategy. 

Our [1989] paper also considered sequential-move stage games, where the
observed outcome need not reveal the long-run player's choice of a
stage-game strategy. Here the lower bound is in general weaker, as the play of
the short-run players may not give the long-run player a chance to build the
reputation he desires. However, in some sequential-move stage games, such
as the chain-store game considered by Kreps-Wilson and Milgrom-Roberts, the
lower bound turns out to be as high as if the game had simultaneous moves.

This  paper  improves  on  our  earlier  one  in  three  ways.   First,  the  lower 
bound  we  obtain  is  in  general  higher,  as  we  now  consider  the  possibility  of 
maintaining  a  reputation  for  mixed  strategies.   For  example,  in  the  Milgrom 
and  Roberts  version  of  the  chain  store  game,  in  which  some  of  the  entrants 
are "tough" and will enter regardless of how they expect the incumbent to
play,  the  most  preferred  commitment  is  to  fight  with  the  minimum  probability 
required  to  deter  the  weak  entrants.   Given  that  reputation  effects  can 
allow  the  incumbent  to  commit  to  the  pure  strategy  of  always  fighting,  it 
seems  interesting  to  know  whether  reputation  effects  can  go  farther  and 
support the mixed strategy that the incumbent prefers. An additional reason
for interest in mixed-strategy reputations is that they allow the short-run
players to update their beliefs in a way we find more plausible: If the
only commitment types are those who always fight, then if the incumbent ever
accommodates he is thought to be weak, regardless of how many times he has
fought in the past. We find it more plausible that an incumbent who has
fought in almost every previous period will be expected to fight again with
high probability, and our model allows that conclusion.

The  second  improvement  over  our  earlier  paper  is  that  we  now  allow  for 
the  possibility  that  the  outcome  of  play  is  only  statistically  determined  by 
the  long-run  player's  action,  so  that  the  long-run  player  is  subject  to 
"moral  hazard"  in  trying  to  maintain  his  reputation.   One  example  of  this  is 
the  model  of  Cukierman-Meltzer  [1986],  where  a  long-run  central  bank  is 
trying  to  maintain  a  reputation  for  restraint  in  the  control  of  the  money 
supply.   Individuals  do  not  observe  the  bank's  action,  which  is  the  rate  of 
money  growth,  but  instead  observe  realized  inflation,  which  is  influenced  by 
money growth and also by a stochastic and unobserved shock. Thus unexpectedly
high inflation could either mean that the shock was high or that the
bank  increased  the  money  supply  more  rapidly  than  had  been  expected,  which 
might  seem  to  make  it  more  difficult  for  reputation  effects  to  emerge.   Our 
results  show  that  the  addition  of  moral  hazard  does  not  change  the  basic 
reputation-effects intuition in that the limiting value of our payoff bounds
is independent of the amount of noise in the system, so long as the outcome
permits the "statistical identification" of the long-run player's play.
(However,  the  noise  can  lower  the  long-run  player's  equilibrium  payoff  for  a 
fixed  value  of  the  discount  factor.) 

Note that the generalizations to mixed-strategy reputations and to
games  with  moral  hazard  are  quite  similar  in  a  formal  sense,  as  in  both 
cases  the  complication  is  that  the  observed  outcome  reveals  only  imperfect 
information  about  the  long-run  player's  unobserved  strategy:   In  the  case 
where  actions  are  observed,  as  in  the  chain-store  game,  the  long-run 
player's  realized  action  will  not  reveal  the  randomizing  probabilities  that 
the  long-run  player  used.   This  is  why  it  is  natural  to  consider  the  two 
generalizations  in  the  same  paper. 


A  third  way  this  paper  improves  on  our  earlier  work  is  that  we  now 
obtain  an  upper  bound  on  the  long-run  player's  payoff  in  addition  to  the 
lower  bound.   The  upper  bound  converges,  as  the  long-run  player's  discount 
factor  approaches  one,  to  the  long-run  player's  "generalized  Stackelberg 
payoff",  which  is  a  generalization  of  the  idea  of  the  Stackelberg  payoff. 
The  generalized  Stackelberg  payoff  can  be  greater  than  the  limit  of  the 
lower  bound  of  the  long-run  player's  equilibrium  payoff,  and  in  general 
games  reputation  effects  do  not  always  lead  to  sharp  predictions.   However, 
if the stage game has simultaneous moves (or satisfies an even weaker condition we
call "identified"), and the prior distribution has full support on the set
of all commitment types (including those corresponding to mixed strategies),
then for generic payoffs the upper and lower bounds both converge to the
Stackelberg  payoff,  and  reputation  effects  have  very  strong  implications 
indeed.   This  conclusion  emphasizes  the  difference  between  games  such  as  the 
chain-store  example,  with  a  single  long-run  player,  and  games  with  several 
long-run  players,  such  as  the  repeated  prisoner's  dilemma  considered  by 
Kreps  et  al.  [1982]:  With  several  long-run  players  the  limit  set  of 
equilibrium  payoffs  with  reputation  effects  need  not  be  a  singleton,  and  can 
depend  on  the  relative  probabilities  of  various  "commitment  types"  (Aumann- 
Sorin  [1989],  Fudenberg-Maskin  [1986]). 

Here is an intuition for our results for games in which the realized
stage-game strategies are observed, that is, simultaneous-move games without
moral hazard. Fix a Nash equilibrium $\sigma^*$ of the game, and suppose that the
long-run player (of whatever type) decides to play the equilibrium strategy
$\sigma_1^*(\bar\omega)$ of a type $\bar\omega$ in the support of the prior distribution. Since the
short-run players are myopic, they will play a best response to $\sigma_1^*(\bar\omega)$ in
any period where they expect player 1's stage-game strategy to be close to
$\sigma_1^*(\bar\omega)$. Conversely, if the short-run players do not play a best response to
$\sigma_1^*(\bar\omega)$, then one would expect them to be "surprised" if that strategy is
indeed played. That is, we would expect them to increase the probability
they assign to the long-run player being type $\bar\omega$. This is reflected in the
fact that when $\sigma_1^*(\bar\omega)$ is a pure strategy, the posterior probability that
$\omega = \bar\omega$ increases in any period where the forecast differs from $\sigma_1^*(\bar\omega)$. If
$\sigma_1^*(\bar\omega)$ is a mixed strategy (or, more generally, the long-run player's
stage-game strategy is not directly observed) the analogous statement is that
there is a non-negligible probability that the outcome causes the short-run
players to revise their beliefs by a non-negligible amount. More precisely,
the martingale convergence theorem implies that for any $\epsilon$ there is a $K(\epsilon)$
such that with probability $(1-\epsilon)$ the short-run players will expect player
1 to play $\sigma_1^*(\bar\omega)$ in all but $K(\epsilon)$ periods.

To obtain the desired payoff bounds, we must strengthen this assertion
by finding an upper bound on the $K(\epsilon)$ that holds uniformly over all Nash
equilibria for a given discount factor and also over all discount factors.
Then, when the long-run player is very patient, what his opponents play in
the fixed number $K(\epsilon)$ of "bad" periods is unimportant.
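
To see why a fixed number of bad periods becomes negligible for a patient player, the following worked bound may help. It is only a sketch: the symbols $\underline{u}$ and $\bar u$ (uniform lower and upper bounds on the per-period payoff) and the worst-case assumption that the $K$ bad periods come first, where discount weights are largest, are introduced here for illustration.

```latex
% Weight that the average discounted criterion puts on the first K periods:
(1-\delta)\sum_{t=1}^{K}\delta^{t-1} \;=\; 1-\delta^{K}.
% Hence the payoff lost in K "bad" periods is at most
(1-\delta^{K})\,(\bar u-\underline{u}) \;\longrightarrow\; 0
\qquad\text{as } \delta \to 1 \text{ with } K \text{ held fixed.}
```

For instance, with $K = 10$ and $\delta = 0.99$ the weight $1-\delta^{K}$ is roughly 0.096, and it shrinks further as $\delta$ rises.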

Finally, we get an upper bound on payoffs by taking $\bar\omega$ to be the
long-run player's true type, while we get a lower bound by taking $\bar\omega$ to be a
"type" committed to playing the Stackelberg strategy.

2.   The  Model 

The long-run player, player 1, plays a fixed stage game against an
infinite sequence of different short-run player 2's. In the stage game
player 1 selects an action $a_1$ from a finite set $A_1$, while that period's
player 2 selects from a finite set $A_2$. Denote action profiles by
$a \in A \equiv A_1 \times A_2$. The stage game is not required to be simultaneous move,
but is allowed to correspond to an arbitrary game tree, so that the "actions"
should be thought of as contingent plans or pure strategies for the
stage game. At the end of each period, the players observe a stochastic
outcome $y$ which is drawn from a finite set $Y$ according to the probability
distribution $p(\cdot \mid a)$. This outcome is defined to include all of the
information players receive about each other's actions. The case where
actions are directly observed is modelled by identifying a distinct outcome
$y(a)$ with each action profile, then setting $p(y(a) \mid a) = 1$.

There  are  two  reasons  that  the  outcome  y  need  not  reveal  the  action 
profile.   First,  if  the  profile  represents  a  strategy  in  an  extensive  form 
stage  game,  then  even  if  the  outcome  y  is  deterministic  it  will  not  reveal 
how players would have played at information sets that were not reached
under  a.   This  possibility  is  illustrated  in  Figure  1,  where  the  outcomes 
are  identified  with  the  terminal  nodes  of  the  stage  game.   Here  the  outcome 
"no  sale"  does  not  reveal  the  quality  that  would  have  been  sold  if  the 
consumer  had  bought. 

[Figure  1  About  Here] 

Second,  even  if  the  actions  are  uncontingent  choices,  the  distribution 
of  outcomes  for  a  fixed  action  profile  may  be  stochastic,  so  that  the 
outcome  gives  only  imperfect  statistical  information  about  the  actions. 
This  is  the  case  for  the  example  in  the  Cukierman-Meltzer  [1986]  paper  on 
inflation  and  monetary  policy. 

Corresponding to the $A_i$ are the spaces $\mathcal{A}_i$ of mixed actions; when
the mixed action profile is $\alpha \in \mathcal{A}_1 \times \mathcal{A}_2$ the resulting probability of $y$
is

$$p(y \mid \alpha) = \sum_{a \in A} p(y \mid a)\,\alpha_1(a_1)\,\alpha_2(a_2).$$

(Note that this formulation includes the special case where $A$ and $Y$ are
isomorphic.)
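
The formula for the outcome distribution under mixed actions can be sketched in a few lines of Python. This is a minimal illustration, not part of the model: the function name `outcome_distribution`, the action labels "e", "s", "b", and the noise probabilities 0.75 and 0.25 are all hypothetical choices made for the example.

```python
from itertools import product


def outcome_distribution(p, alpha1, alpha2):
    """Return {y: p(y | alpha)} for independent mixed actions alpha1, alpha2.

    p maps each pure profile (a1, a2) to a dict {y: p(y | a)};
    alpha1 and alpha2 map pure actions to probabilities.
    """
    dist = {}
    for a1, a2 in product(alpha1, alpha2):
        weight = alpha1[a1] * alpha2[a2]
        for y, prob in p[(a1, a2)].items():
            dist[y] = dist.get(y, 0.0) + prob * weight
    return dist


# Hypothetical moral-hazard stage game: player 1 exerts effort "e" or
# shirks "s", player 2 has a single action "b", and the public outcome
# is a noisy signal of player 1's choice.
p = {
    ("e", "b"): {"good": 0.75, "bad": 0.25},
    ("s", "b"): {"good": 0.25, "bad": 0.75},
}
alpha1 = {"e": 0.5, "s": 0.5}
alpha2 = {"b": 1.0}
dist = outcome_distribution(p, alpha1, alpha2)
```

The directly observed case corresponds to a table `p` in which every profile puts probability 1 on its own distinct outcome.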

We wish to define the outcome $y$ to contain all of the information the
short-run players receive about the long-run player's choice of action $a_1$.
Accordingly, we require that their payoff depend on $a_1$ only through its
influence on the distribution of $y$.² The short-run players all have the
same expected utility function $u_2 : Y \times A_2 \to \mathbb{R}$. Let

$$v_2(\alpha) = \sum_{a \in A,\, y \in Y} u_2(y, a_2)\, p(y \mid a)\,\alpha_1(a_1)\,\alpha_2(a_2)$$

denote the expected payoff corresponding to the mixed action $\alpha$. Each
period's short-run player acts to maximize that period's expected payoff.

All players know the short-run players' payoff function. On the other
hand, the long-run player knows his own payoff function, but the short-run
players do not. We represent their uncertainty using Harsanyi's [1967]
notion of a game of incomplete information. The long-run player's payoff is
identified with his "type" $\omega \in \Omega$, where $\Omega$ is a metric space. It is
common knowledge that the short-run players have (identical) prior beliefs
$\mu$ about $\omega$,³ represented by a probability measure on $\Omega$. As with
short-run players, we suppose that the per-period payoff of the long-run player
depends on the action $a_2$ of his opponent only through its influence on the
distribution of $y$. We allow this utility $u_1(a_1, y, \omega, t)$ to be non-stationary,
and assume that it is bounded uniformly, so that for some $\underline{u} < \bar u$,
$\underline{u} \le u_1(a_1, y, \omega, t) \le \bar u$ for all $\omega$ and $t$. The overall utility is the
expected average discounted value

$$E\,(1-\delta) \sum_{t=1}^{\infty} \delta^{t-1} u_1(a_1(t), y(t), \omega, t)$$

where $0 \le \delta < 1$. The normalization by $(1-\delta)$ places per-period and
repeated-game payoffs on the same scale. As in the case of the short-run player, we
may define the expected payoff to a mixed action:

$$v_1(\alpha, \omega, t) = \sum_{a \in A,\, y \in Y} u_1(a_1, y, \omega, t)\, p(y \mid a)\,\alpha_1(a_1)\,\alpha_2(a_2).$$

Both long-run and short-run players can observe and condition their
play at time $t$ on the entire past history of the realized outcomes. The
long-run player can also condition his play on his private information and
on his own past actions. Let $H_t$ denote the set of possible public histories
(of outcomes) through time $t$, including the null history $h_0$. A pure
strategy for the period-$t$ player 2 is a map $s_2^t : H_{t-1} \to A_2$, while the set
of all such strategies is denoted $S_2^t$. Let $H_t^1$ denote the set of player
1's possible private histories (the past realizations of $a_1$) through time
$t$. A pure strategy $s_1$ for any type $\omega$ of player 1 is a sequence of maps
$s_1^t : H_{t-1} \times H_{t-1}^1 \to A_1$, specifying his play as a function of history; the set
of all such $s_1$ is denoted $S_1$.

Let $\Sigma_1$ and $\Sigma_2^t$ be the sets of probability distributions over $S_1$
and $S_2^t$, and let $\Sigma_2 \equiv \times_{t=1}^{\infty} \Sigma_2^t$. Each $\sigma \in \Sigma_1 \times \Sigma_2$ gives rise to a probability
distribution over sequences of actions and outcomes. Consequently we let
$E_\sigma$ denote the expectation with respect to this distribution, and define

$$U_1(\sigma, \omega) = E_\sigma (1-\delta) \sum_{t=1}^{\infty} \delta^{t-1} u_1(a_1(t), y(t), \omega, t)$$

to be the expected utility to player 1.




Strategies for the long-run player must also account for the fact that
there are many types. Since we wish to allow $\Omega$ to be infinite, we follow
Milgrom-Weber [1985] in considering distributional strategies $\mathbf{s}_1 \in \mathcal{S}_1$.
These are joint probability distributions over $\Omega \times S_1$ with the property
that the marginal on $\Omega$ equals $\mu$. Each $\mathbf{s}_1$ and $\Omega' \subset \Omega$ with $\mu(\Omega') > 0$
gives rise to conditional strategies $\mathbf{s}_1(\Omega') \in \Sigma_1$ through integration: If
$S_1' \subset S_1$, $\mathbf{s}_1(\Omega')(S_1') = \mathbf{s}_1(\Omega' \times S_1') / \mu(\Omega')$.

A Nash equilibrium can now be defined as a pair $(\mathbf{s}_1, \sigma_2) \in \mathcal{S}_1 \times \Sigma_2$ so
that $\sigma_2$ is a best response to $\mathbf{s}_1(\Omega)$ and so that if $(\omega, s_1) \in \text{support } \mathbf{s}_1$,
then $s_1$ is a best response to $\sigma_2$ by type $\omega$.

Each $\sigma \in \Sigma_1 \times \Sigma_2$ induces a probability distribution over $H_\infty$, the
public histories of infinite length. Moreover, each $h_t \in H_t$ may be
identified with the subset of $h \in H_\infty$ that coincide with $h_t$ through time $t$.
In this way, we may view the $H_t$ (which are finite sets) as sub-sigma
algebras of the Borel sets in $H_\infty$, and view random variables on $H_\infty$ as
stochastic processes on $(\{H_t\}_{t=1}^{\infty}, H_\infty)$. We shall adopt this point of view
frequently in the sequel.

Since our main concern will be the evolution of the posterior
probabilities over the long-run player's type, it is convenient to have a
special notation to express the likelihood function for the event that $\omega$
lies in various subsets of $\Omega$. Fix $\mathbf{s}_1$, $\sigma_2$ and a subset $\Omega^+ \subset \Omega$ with
$\mu(\Omega^+) > 0$. Let $\Omega^- \equiv \Omega \setminus \Omega^+$ be the complement of $\Omega^+$. We let $\sigma_1 = \mathbf{s}_1(\Omega)$,
$\sigma_1^+ = \mathbf{s}_1(\Omega^+)$ and $\sigma_1^- = \mathbf{s}_1(\Omega^-)$ be the induced probability distributions over
strategies corresponding to all types, types in $\Omega^+$ and types in $\Omega^-$
respectively. We also set $\sigma = \sigma_1 \times \sigma_2$, $\sigma^+ = \sigma_1^+ \times \sigma_2$ and $\sigma^- = \sigma_1^- \times \sigma_2$. The
corresponding probability distributions on outcomes at time $t$ conditional on
$h_{t-1}$ are denoted $p(h_{t-1})$, $p^+(h_{t-1})$ and $p^-(h_{t-1})$ respectively. These are
families of random vectors on $H_\infty$. Similarly, we can set $\alpha_1(h_{t-1})$, $\alpha_1^+(h_{t-1})$,
and $\alpha_1^-(h_{t-1})$ to be the time-$t$ probabilities of actions by player 1 given
$h_{t-1}$, and set $\alpha_2(h_{t-1})$ to be the conditional probability of actions by
player 2. Note that $\alpha_2(h_{t-1}) \equiv \sigma_2(h_{t-1})$ is independent of $\sigma_1$. Since the
expected conditional expectation equals the expectation, we may calculate

$$U_1(\sigma^+, \omega) = E_{\sigma^+} (1-\delta) \sum_{t=1}^{\infty} \delta^{t-1} v_1(\alpha_1^+(h_{t-1}), \alpha_2(h_{t-1}), \omega, t).$$

Finally, the conditional probability of a type in $\Omega^+$ under $\sigma$ given $h_{t-1}$
is denoted $\mu(\Omega^+ \mid h_{t-1})$.
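
As a reading aid, the relationship between the three forecasts and the posterior can be written out explicitly. The two identities below are the law of total probability and Bayes' rule applied to the objects just defined; the component notation $p(y_t \mid h_{t-1})$ for the probability that $p(h_{t-1})$ assigns to the realized outcome $y_t$ is an assumption made here, and the second identity presumes $p(y_t \mid h_{t-1}) > 0$.

```latex
% Unconditional forecast as a posterior-weighted mixture:
p(h_{t-1}) \;=\; \mu(\Omega^+ \mid h_{t-1})\, p^+(h_{t-1})
          \;+\; \bigl(1-\mu(\Omega^+ \mid h_{t-1})\bigr)\, p^-(h_{t-1}).
% Posterior update after outcome y_t is observed, with h_t = (h_{t-1}, y_t):
\mu(\Omega^+ \mid h_t) \;=\; \mu(\Omega^+ \mid h_{t-1})\,
   \frac{p^+(y_t \mid h_{t-1})}{p(y_t \mid h_{t-1})}.
```

The second line shows that the posterior on $\Omega^+$ rises exactly when the realized outcome was more likely under $p^+$ than under the unconditional forecast $p$.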

The  power  of  reputation  effects  depends  on  which  reputations  are  a 
priori feasible and this depends on which types have positive prior probability.
To model reputations for always playing a particular pure action, we
use  "commitment  types"  who  prefer  to  always  play  that  action.   There  are 
several  ways  of  constructing  the  model  so  that  the  long-run  player  has  the 
option  of  trying  to  maintain  a  reputation  for  playing  a  mixed  action.   The 
simplest  way  to  do  this  supposes  there  are  "commitment  types"  who  like  to 
play  specific  mixed  actions.  While  this  may  not  be  completely  implausible, 
it  has  the  awkward  feature  that  such  types  cannot  be  expected  utility 
maximizers. Alternatively, reputations for mixed actions can be modelled in
an expected utility framework with the following technical device. Let
$\Omega_0 \subset \Omega$ be the "irrational types". We identify $\Omega_0$ with $\mathcal{A}_1 \times A_1^\infty$, the
product of mixed actions with the space of sequences of actions. (The
latter is a compact space in the product topology.) If $\omega = (\alpha_1, \{a_1(t)\}_{t=1}^{\infty})$
then $u_1(a_1, y, \omega, t) = \bar u$ if $a_1 = a_1(t)$ and $u_1(a_1, y, \omega, t) = \underline{u}$ if
$a_1 \ne a_1(t)$, so that almost surely type $\omega$ will play $\{a_1(t)\}_{t=1}^{\infty}$ in a Nash
equilibrium of the repeated game. Note that irrational types are expected
utility maximizers, but have non-stationary time-additive preferences: each
strictly prefers a particular sequence of actions to all others. We let
$\Omega(\alpha_1)$ be the set $\{\omega \in \Omega_0 \mid \omega = (\alpha_1, \{a_1(t)\}_{t=1}^{\infty})$ for some $\{a_1(t)\}_{t=1}^{\infty}\}$. Given
$\mu$, probabilities conditional on the sets $\Omega(\alpha_1)$ exist for almost all $\alpha_1$.
On the other hand, playing $\alpha_1$ independently in each period by Kolmogorov's
Theorem induces a unique probability distribution on $\Omega(\alpha_1)$. We assume that
$\mu$ is such that the conditional probabilities on $\Omega(\alpha_1)$ are equal to this
distribution almost surely. In this sense $\Omega(\alpha_1)$ can be viewed as a kind
of "commitment type": conditional on $\omega$ lying in this set, the unique
dominant strategy leads to a probability distribution over actions for the
long-run player that is "as if" he mixed independently following $\alpha_1$.

The prior $\mu$ induces a measure $\eta$ on the set of mixed actions by
$\eta(\mathcal{A}_1') = \mu\bigl(\bigcup_{\alpha_1 \in \mathcal{A}_1'} \Omega(\alpha_1)\bigr)$. If the part of $\eta$ that is absolutely continuous
with respect to Lebesgue measure is non-zero and has a density that is
uniformly bounded away from 0, we say that commitment types have full
support.

As an example, to model a single type who likes to randomize with $\alpha_1$
equal to 1/2-1/2 between actions H and T, we introduce a set of types $\Omega(\alpha_1)$
corresponding to sequences (H,H,H,...), (T,H,H,...), (H,T,H,T,...) and so
forth, together with the induced probability distribution from i.i.d. coin
flips.
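
The coin-flip construction can be made concrete with a short Python sketch. It works with finite prefixes only, since the infinite sequences themselves cannot be enumerated; the helper name `prefix_probability` and the choice of prefix length 3 are assumptions made for this illustration.

```python
from fractions import Fraction
from itertools import product


def prefix_probability(prefix, alpha1):
    """Probability that an i.i.d. sequence drawn from alpha1 starts with `prefix`."""
    prob = Fraction(1)
    for action in prefix:
        prob *= alpha1[action]
    return prob


# The 1/2-1/2 randomizing type from the example.
alpha1 = {"H": Fraction(1, 2), "T": Fraction(1, 2)}

# Each length-3 prefix stands for the set of sequence types whose
# prescribed play begins that way; there are 2**3 such sets.
prefixes = list(product("HT", repeat=3))
probs = {pfx: prefix_probability(pfx, alpha1) for pfx in prefixes}
```

Each of the eight prefix sets receives probability 1/8, so an observer who cannot see which sequence type was drawn faces the same distribution of play as if a single player flipped a fair coin every period.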

3.   Self-Confirming  Responses  and  Equilibrium  Payoffs 

This section develops a theorem on the upper and lower bounds of the
long-run player's Nash equilibrium payoffs. The proof uses a result about
Bayesian inference, proved in the next section, that provides uniform bounds
on how often the short-run players can be "substantially wrong" in their
forecast of the long-run player's play.

Fix $\mathbf{s}_1$, $\sigma_2$ and $\Omega^+$ with $\mu(\Omega^+) > 0$. Note that $p(h_{t-1})$ is the
forecast the period-$t$ player 2 makes about period-$t$ outcomes, knowing
$\mathbf{s}_1 \times \sigma_2$. By way of contrast $p^+(h_{t-1})$ is what he would forecast if he knew
that the long-run player's type is in $\Omega^+$. If $p$ is a probability distribution
over outcomes $m = 1, \ldots, M$, define $\|p\| \equiv \max_m |p_m|$. In the next
section we prove the following result.

Theorem 4.1: For every $\epsilon > 0$, $\Delta_0 > 0$ and $\Omega^+$ with $\mu(\Omega^+) > 0$ there is
a $K$ depending only on these three numbers such that for any $\mathbf{s}_1$ and $\sigma_2$,
under the probability distribution generated by $\mathbf{s}_1(\Omega^+)$, there is probability
less than $\epsilon$ that there are more than $K$ periods with

$$\|p^+(h_{t-1}) - p(h_{t-1})\| > \Delta_0.$$

Loosely speaking, if $\Omega^+$ is true, the short-run players forecast the
outcome $y$ about as well in almost every period as they would if they knew
that $\Omega^+$ was true. (This is loose because $p$ and $p^+$ depend on the
short-run player's action $\alpha_2$.)
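
A small numerical sketch may help fix ideas about what the theorem bounds. It is a special case under strong assumptions that are not in the text: there are only two types, outcomes reveal actions, the set $\Omega^+$ is a single commitment type that always plays "H", and the other type plays "H" with a fixed hypothetical probability `r` in every period. The function name and all numbers are illustrative, and the sketch does not reproduce the uniformity over equilibria that the theorem delivers.

```python
def count_bad_periods(prior, r, delta0, horizon):
    """Count periods in which the forecast is off by more than delta0.

    prior   : prior probability of the commitment type (always plays "H")
    r       : probability with which the other type plays "H" each period
    delta0  : threshold on the sup-norm distance between p+ and p
    horizon : number of periods followed along the all-"H" history
    """
    mu = prior
    bad = 0
    for _ in range(horizon):
        forecast_h = mu + (1.0 - mu) * r   # p assigns this probability to "H"
        gap = 1.0 - forecast_h             # ||p+ - p|| when p+ = (1, 0)
        if gap > delta0:
            bad += 1
        mu = mu / forecast_h               # Bayes' rule after observing "H"
    return bad


bad_50 = count_bad_periods(prior=0.1, r=0.5, delta0=0.1, horizon=50)
bad_500 = count_bad_periods(prior=0.1, r=0.5, delta0=0.1, horizon=500)
```

With these hypothetical numbers the forecast is off by more than 0.1 in only the first 6 periods, and lengthening the horizon from 50 to 500 does not add any: each "H" raises the posterior on the commitment type until the two forecasts are close.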

This  section  uses  Theorem  4.1  to  characterize  the  long-run  player's 
equilibrium  payoffs.   To  begin,  we  define  what  it  means  for  the  short-run 
player's  action  to  be  a  best  response  to  approximately  correct  beliefs  about 
the  distribution  over  outcomes,  as  opposed  to  beliefs  about  the  long-run 
player's  action.   Because  we  will  assume  that  commitment  types  have  full 
support,  all  mixed  actions  by  player  1  have  positive  probability  in  any  Nash 
equilibrium.   Thus  no  player  2  will  ever  choose  a  weakly  dominated  strategy, 
and  we  exclude  these  strategies  in  our  definition  of  a  best  response. 
Definition: A mixed action $\alpha_2$ is an $\epsilon$-confirmed best response to $\alpha_1$ if

(1) $\alpha_2$ is not weakly dominated,⁴ and

(2) there exists $\alpha_1'$ such that

(a) $\alpha_2$ solves $\max_{\alpha_2'} v_2(\alpha_1', \alpha_2')$, and

(b) $\|p(\cdot \mid (\alpha_1, \alpha_2)) - p(\cdot \mid (\alpha_1', \alpha_2))\| \le \epsilon$.


We let $B_\epsilon(\alpha_1)$ denote the set of all $\epsilon$-confirmed best responses to
$\alpha_1$. Note that $B_0(\alpha_1)$ is not the same as the set of all (undominated) best
responses to $\alpha_1$, as there may be distinct strategies $\alpha_1$ and $\alpha_1'$ with
$p(\cdot \mid (\alpha_1, \alpha_2)) = p(\cdot \mid (\alpha_1', \alpha_2))$, and $B_0(\alpha_1)$ then contains best responses to
both $\alpha_1$ and $\alpha_1'$. For example, in the game of Figure 1, "buy" is the
unique best response to "high quality", but "don't buy" is also in
$B_0$(high quality) as "don't buy" is a best response to "low quality", and
the profiles (high quality, don't buy) and (low quality, don't buy) lead to
the same terminal nodes. In the terminology of our [1989] paper the
elements of $B_0(\alpha_1)$ are generalized best responses.

We now relate this to Nash equilibrium payoffs. First note that a Nash
equilibrium exists in each finite-horizon truncation of the game (see Milgrom-Weber
[1985]). Compactness and the fact that preferences are uniformly
continuous in the product topology then imply the existence of a convergent
sequence of truncated equilibria whose limit is an equilibrium in the
infinite game. (See Fudenberg-Levine [1983], for example.) For a long-run
player of type $\omega$, if $\mu(\{\omega\}) > 0$, we can define $\underline{N}_1(\delta, \omega)$ and $\bar{N}_1(\delta, \omega)$
to be the infimum and supremum of his payoff in any Nash equilibrium.

In equilibrium, if the commitment types have full support, the fact that
short-run players play myopically implies that $\alpha_2(h_{t-1}) \in B_0(\alpha_1(h_{t-1}))$.
Moreover, if $\|p^+(h_{t-1}) - p(h_{t-1})\| \le \Delta_0$, as in the conclusion of Theorem
4.1, then in equilibrium $\alpha_2(h_{t-1}) \in B_{\Delta_0}(\alpha_1^+(h_{t-1}))$.

We now focus our attention on a particular class of long-run players.
Type $\omega_0$ is time-stationary if $u_1(a_1, y, \omega_0, t) = u_1(a_1, y, \omega_0, t')$ for all
$t, t'$. For such a type $v_1$ is time-stationary as well. Our main theorem
provides bounds on the equilibrium payoff of time-stationary types in terms
of the "$\epsilon$-least" and "$\epsilon$-greatest" commitment payoffs, which we will now
define. The $\epsilon$-least commitment payoff is

$$\underline{v}_1(\omega_0, \epsilon) = \sup_{\alpha_1} \inf_{\alpha_2 \in B_\epsilon(\alpha_1)} v_1(\alpha_1, \alpha_2, \omega_0) - \epsilon.$$

This is $\epsilon$ less than the least type $\omega_0$ gets by committing to any fixed
strategy when the short-run player plays an $\epsilon$-confirmed response. Note that
the definition allows player 2 to choose the response player 1 likes least
whenever he is indifferent between two or more responses. This is a pessimistic
measure of the power of commitment. The $\epsilon$-greatest commitment payoff
is

$$\bar v_1(\omega_0, \epsilon) = \sup_{\alpha_1} \sup_{\alpha_2 \in B_\epsilon(\alpha_1)} v_1(\alpha_1, \alpha_2, \omega_0).$$

Obviously

$$\bar v_1(\omega_0, \epsilon) \ge \underline{v}_1(\omega_0, \epsilon).$$


For $\epsilon = 0$ we call $\bar v_1(\omega_0, 0)$ the generalized Stackelberg payoff. Since the
supremum is taken over all generalized best responses to $\alpha_1$, instead of
only the best responses, the generalized Stackelberg payoff is at least as
large as the usual Stackelberg payoff (modulo our restriction to undominated
responses).

If the observed outcomes correspond to the terminal nodes of an
extensive form stage game, then the generalized Stackelberg payoff is the
same as the usual one. To see this, recall that if $\alpha_2$ is a generalized
best response to $\alpha_1$, then there is an $\alpha_1'$ such that $(\alpha_1', \alpha_2)$ and
$(\alpha_1, \alpha_2)$ generate the same probability distribution over outcomes. Thus, if
outcomes correspond to terminal nodes, $(\alpha_1', \alpha_2)$ and $(\alpha_1, \alpha_2)$ generate the
same distribution over terminal nodes and hence give player 1 the same
expected payoff. Thus player 1 can do as well by playing $\alpha_1'$ and having
player 2 play a best response to player 1's true action.

However, when the outcomes do not correspond to the terminal nodes, the
generalized Stackelberg payoff can be strictly greater than the usual one.
As an example, consider a simultaneous move game in which player 1 chooses
U or D, and player 2 chooses L or R. There are three outcomes that
occur in a deterministic manner: If (U,L) or (D,L) is played the outcome
is $y_1$, if (U,R) is played the outcome is $y_2$, while (D,R) leads
to $y_3$. Player 1's payoff function is $u_1(U,y_1) = 2$; $u_1(D,y_1) = 0$;
$u_1(U,y_2) = 0$; $u_1(D,y_3) = 1$. Player 2's payoffs are $u_2(L,\cdot) = 1$;
$u_2(R,y_2) = 2$ and $u_2(R,y_3) = 0$. The strategic form is shown in Figure 2.

[Figure  2  About  Here] 
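The gap between the two payoff notions in this example can be checked numerically. The following is only a sketch: the payoff numbers are our reading of a partly illegible scan (an assumption), player 2 is restricted to pure responses, and player 1's mixed actions are searched on a grid of hypothetical size 101. A response counts as a generalized best response to Pr(U) = p if it is a best response to some q that induces the same distribution over outcomes as p.

```python
from fractions import Fraction

# Deterministic outcome function: (a1, a2) -> outcome.
OUTCOME = {("U", "L"): "y1", ("D", "L"): "y1",
           ("U", "R"): "y2", ("D", "R"): "y3"}

# Assumed payoffs (the scan is partly illegible; see the lead-in).
U1 = {("U", "y1"): 2, ("D", "y1"): 0, ("U", "y2"): 0, ("D", "y3"): 1}
U2 = {("L", "y1"): 1, ("R", "y2"): 2, ("R", "y3"): 0}

GRID = [Fraction(k, 100) for k in range(101)]  # candidate values of Pr(U)


def v1(p, a2):
    """Player 1's expected payoff when he plays U with probability p."""
    return (p * U1[("U", OUTCOME[("U", a2)])]
            + (1 - p) * U1[("D", OUTCOME[("D", a2)])])


def v2(p, a2):
    """Player 2's expected payoff from the pure action a2."""
    return (p * U2[(a2, OUTCOME[("U", a2)])]
            + (1 - p) * U2[(a2, OUTCOME[("D", a2)])])


def best_responses(p):
    """Pure best responses of player 2 to the mixed action p."""
    top = max(v2(p, a2) for a2 in "LR")
    return [a2 for a2 in "LR" if v2(p, a2) == top]


def outcome_dist(p, a2):
    """Distribution over outcomes induced by (p, a2), zero entries dropped."""
    dist = {}
    for a1, prob in (("U", p), ("D", 1 - p)):
        y = OUTCOME[(a1, a2)]
        dist[y] = dist.get(y, 0) + prob
    return {y: q for y, q in dist.items() if q != 0}


def generalized_best_responses(p):
    """Keep a2 if it is a best response to some q observationally equal to p."""
    return [a2 for a2 in "LR"
            if any(a2 in best_responses(q)
                   and outcome_dist(q, a2) == outcome_dist(p, a2)
                   for q in GRID)]


def stackelberg():
    return max(v1(p, a2) for p in GRID for a2 in best_responses(p))


def generalized_stackelberg():
    return max(v1(p, a2) for p in GRID
               for a2 in generalized_best_responses(p))


print(stackelberg(), generalized_stackelberg())
```

Exact rational arithmetic is used so that player 2's indifference at Pr(U) = 1/2 is detected without floating-point tolerance.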

Here the generalized Stackelberg payoff is 2, which is attained by player 1
playing U, and player 2 playing L. L is a generalized best response to
U, as L is a best response to D, and $p(\cdot|(U,L)) = p(\cdot|(D,L))$. In
contrast, the Stackelberg payoff is only 1, which is attained by
$\alpha_1 = (\tfrac{1}{2}U, \tfrac{1}{2}D)$. We now state our main result.

Theorem 3.1: If $\omega_0$ is a stationary type with $\mu(\{\omega_0\}) > 0$, and
commitment types have full support, then for all $\epsilon > 0$ there exists $K$ so that,
for all $\delta$,

$$\cdots \le \bar{N}_1(\delta,\omega_0) \le (1-\epsilon)\delta^K\,\bar{v}_1(\omega_0,\epsilon) + [1-(1-\epsilon)\delta^K]\,\bar{u}.$$

Proof: As we remarked earlier, because commitment types have full support,
player 2 never plays a strategy that is weakly dominated by any mixed
strategy. Consequently, in any Nash equilibrium $(\sigma_1,\sigma_2)$, if
$\|p^+(h_{t-1}) - p(h_{t-1})\| \le \Delta_0$, then $\alpha_2(h_{t-1}) \in B_{\Delta_0}(\alpha_1^+(h_{t-1}))$.

To establish the upper bound, take $\Omega^+ = \{\omega_0\}$. We have

$$U_1(\sigma^+,\sigma_2,\omega_0) = E_{\sigma^+}\,(1-\delta)\sum_{t=1}^{\infty} \delta^{t-1}\, v_1(\alpha_1^+(h_{t-1}),\alpha_2(h_{t-1}),\omega_0)$$

as the payoff to $\omega_0$. Choosing $\Delta_0 = \epsilon$, we conclude from Theorem 4.1 that
there is a $K$ such that with probability $(1-\epsilon)$, player 2's equilibrium
action $\alpha_2(h_{t-1})$ lies in $B_\epsilon(\alpha_1^+(h_{t-1}))$ in all but $K$ periods. Since
$\alpha_1^+(h_{t-1})$ is the expected play of type $\omega_0$ given history $h_{t-1}$ (averaging
over the different private histories consistent with $h_{t-1}$), we
conclude that type $\omega_0$'s expected equilibrium payoff is bounded by
$\bar{v}_1(\omega_0,\epsilon)$ in all but the $K$ "exceptional" periods. Since payoffs in the
exceptional periods are bounded above by $\bar{u}$, and the present value is
maximized if the payoffs $\bar{u}$ occur in the first $K$ periods, the upper bound
follows.
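The discounting arithmetic behind the last step can be written out explicitly. This is a sketch in our own shorthand: $\bar v$ stands for $\bar{v}_1(\omega_0,\epsilon)$, payoffs are assumed to be normalized by $(1-\delta)$, and we use $\bar v \le \bar u$, so that placing the exceptional periods first and giving the bad event its full probability $\epsilon$ is the worst case for the bound.

```latex
% On the event (probability at least 1-\epsilon) with at most K exceptional
% periods, front-loading the exceptional payoffs \bar u gives at most
(1-\delta)\sum_{t=1}^{K}\delta^{t-1}\,\bar u
  + (1-\delta)\sum_{t=K+1}^{\infty}\delta^{t-1}\,\bar v
  = (1-\delta^{K})\,\bar u + \delta^{K}\,\bar v .
% On the complementary event the payoff is at most \bar u, so in total
(1-\epsilon)\bigl[(1-\delta^{K})\,\bar u + \delta^{K}\,\bar v\bigr] + \epsilon\,\bar u
  = (1-\epsilon)\,\delta^{K}\,\bar v + \bigl[1-(1-\epsilon)\,\delta^{K}\bigr]\,\bar u .
```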

To establish the lower bound choose $\epsilon' > 0$ so that if $|\alpha_1'-\alpha_1| < \epsilon'$
then $\|v_1(\alpha_1',\alpha_2,\omega_0) - v_1(\alpha_1,\alpha_2,\omega_0)\| < \epsilon$ and
$\|p(\cdot|(\alpha_1',\alpha_2)) - p(\cdot|(\alpha_1,\alpha_2))\| < \epsilon/2$ for all $\alpha_2$. (Such an $\epsilon'$ exists
since $v_1$ and $p$ are continuous functions on compact sets.) Fix an $\alpha_1$,
and take $\Omega^+$ to be the union of $\Omega(\alpha_1')$ over $|\alpha_1'-\alpha_1| < \epsilon'$. Note that
since $\alpha_1^+(h_{t-1})$ is a convex combination of $\alpha_1'$ satisfying $|\alpha_1'-\alpha_1| \le \epsilon'$,
$|\alpha_1^+(h_{t-1})-\alpha_1| \le \epsilon'$. Note also that since the commitment types have full
support, $\mu(\Omega^+)$ is bounded below by an $\eta > 0$ that is independent of $\alpha_1$
(but in general depending on $\epsilon'$). Consequently we may find a $K$ as in the
conclusion of Theorem 4.1 that is independent of $\alpha_1$.

Suppose that type $\omega_0$ adopts the strategy $\sigma_1^+$ associated with $\Omega^+$.
Choosing $\Delta_0 = \epsilon/2$ we see that under $\sigma^+$ there is probability at least
$(1-\epsilon)$ that in all but $K$ exceptional periods $\alpha_2(h_{t-1}) \in B_{\epsilon/2}(\alpha_1^+(h_{t-1}))$.
Since $\|\alpha_1^+(h_{t-1})-\alpha_1\| \le \epsilon'$, $\|v_1(\alpha_1^+(h_{t-1}),\alpha_2,\omega_0) - v_1(\alpha_1,\alpha_2,\omega_0)\| < \epsilon$,
and so except in the exceptional periods, $\omega_0$ gets at least
$\min_{\alpha_2 \in B_\epsilon(\alpha_1)} v_1(\alpha_1,\alpha_2,\omega_0) - \epsilon$. The lower bound now follows from taking the
supremum over $\alpha_1 \in \mathcal{A}_1$. ∎

It might be thought that the upper bound is too weak, and in particular
that when actions are observed, for any discount factor the highest
equilibrium payoff should be the Stackelberg payoff $\bar{v}_1(\omega,0)$. However, this is not
the case. For a fixed discount factor a type may receive "information rents"
from the possibility of other types that give it a payoff higher than the
Stackelberg level. Consider for example the matrix game in Figure 3, where
player 1 chooses rows, player 2 chooses columns, and the matrix gives the
payoffs of type $\omega_0$ of player 1 and player 2.

[Figure 3 About Here]

Here type $\omega_0$'s Stackelberg payoff is 1, and he would like to commit
himself to U. Type $\omega_0$ would not like to commit himself to play D,
because that strategy is strictly dominated. However, type $\omega_0$ would like
player 2 to believe that he was playing D, for this induces the response
of L, and allows him a payoff of 2. If the prior distribution places high
probability on player 1 being a type who usually plays D, then player 2
will play L at least in the first few periods. Thus if $\delta$ is small, type
$\omega_0$ could obtain a payoff of almost 2 by playing U while player 2 played
L. However, to obtain this payoff player 1 must "fool" player 2. The
intuition for the upper bound is that the short-run players cannot be fooled
infinitely often.

One example from the literature where the long-run player does do better
than his Stackelberg payoff for small $\delta$ is Benabou-Laroque [1988]. They
consider a model of an informed "insider" who knows if the state is "good" or
"bad", and who can send a possibly dishonest report of his information to the
market. The "sane" insider would like to mislead the market, but this cannot
occur in equilibrium when his type is known; the sane type's commitment payoff
corresponds to revealing no information. However, if the market believes
there is positive probability that the insider will always report honestly,
the sane type can play a mixed strategy that announces "good" when the state
is "good" with probability less than 1/2, but such that the marginal
distribution of reports, averaged over the insider's type, is such that an
announcement of "good" means the probability that the state is good exceeds
1/2. Hence the market price will rise when the insider announces "good", and
the "sane" insider can earn a rent by misleading the market. Benabou-Laroque
are concerned with the nature of the equilibrium strategies for a fixed $\delta$ and
not with the sorts of payoff bounds that we develop. However, it is interesting
to note that our theorem implies that the sane type's payoff converges to the
no-communication payoff as $\delta$ converges to 1, regardless of the prior
probability that the speculator is honest.

We now formalize the idea that information rents should vanish in the limit
as $\delta \to 1$. Let $\underline{N}_1(\omega)$ be the lim inf of $\underline{N}_1(\delta,\omega)$ and let $\bar{N}_1(\omega)$ be the
lim sup of $\bar{N}_1(\delta,\omega)$ as $\delta \to 1$.

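Corollary 3.2: If $\omega_0$ is a stationary type with $\mu(\{\omega_0\}) > 0$ and the
commitment types have full support, then

$$\underline{v}_1(\omega_0,0) \le \underline{N}_1(\omega_0) \le \bar{N}_1(\omega_0) \le \bar{v}_1(\omega_0,0).$$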

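Proof: Letting $\delta \to 1$ and then $\epsilon \to 0$ in Theorem 3.1 shows that we need
only establish that $\liminf_{\epsilon \to 0} \underline{v}_1(\omega_0,\epsilon) \ge \underline{v}_1(\omega_0,0)$ and
$\limsup_{\epsilon \to 0} \bar{v}_1(\omega_0,\epsilon) \le \bar{v}_1(\omega_0,0)$. This in turn follows from the upper
hemi-continuity of $B_\epsilon(\alpha_1)$ in $\epsilon$; that is, $\epsilon^n \to 0$, $\alpha_2^n \in B_{\epsilon^n}(\alpha_1)$ implies that
any limit point $\alpha_2$ of $\alpha_2^n$ lies in $B_0(\alpha_1)$. ∎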

When is the bound tight; that is, when are the lower and upper bounds
equal? Roughly speaking, two conditions are required: The short-run
players should care about how the long-run player plays, and the outcome $y$
must reveal "enough" statistical information about the long-run player's
action.

To see how $\underline{v}_1$ and $\bar{v}_1$ can differ if the short-run player does
not care about the actions of the long-run player, consider a game in which
$A_1$ is a singleton and $A_2 = \{0,1\}$. The short-run player's payoff is zero
no matter what he plays, while $u_1(a_1,a_2,\omega_0) = a_2$. Clearly $\underline{N}_1(\omega_0) = 0$
and $\bar{N}_1(\omega_0) = 1$ regardless of the discount factor or the presence of
commitment types.

For the two bounds to be equal we need to exclude this type of game.
The games we exclude are degenerate in the sense that they are non-generic
in the space of payoff functions. The game is non-degenerate if there is no
undominated pure action $a_2 \in A_2$ such that for some $\alpha_2 \ne a_2$,

$$v_2(\cdot,a_2) = v_2(\cdot,\alpha_2).$$

This condition rules out the degenerate example above, but is satisfied
for an open dense set of payoffs: If $a_2$ is undominated and
$v_2(\cdot,a_2) = v_2(\cdot,\alpha_2)$ with $\alpha_2 \ne a_2$, then in the game with
$u_2'(y,a_2') = u_2(y,a_2')$ for $a_2' \ne a_2$, and $u_2'(y,a_2) = u_2(y,a_2) - \epsilon$, $a_2$ is
weakly dominated by $\alpha_2$. (It can also be shown that non-degenerate games
have full measure in the space of payoff functions.)

Next, to see what can happen when the outcome does not reveal sufficient
information about the long-run player's action, consider the following
quality choice game adapted from Fudenberg-Levine [1989] (see also Figure 1).
The long-run player chooses $a_1 \in \{$high quality, low quality$\}$, and the
short-run player may play $a_2 \in \{$do not buy, buy$\}$. The possible outcomes are
$Y = \{$no sale, buy high quality, buy low quality$\}$, corresponding to the 3
terminal nodes of the extensive form. If the short-run player buys, the
outcome is buy high quality or buy low quality according to $a_1$. If he does
not buy, the outcome is no sale regardless of $a_1$. The short-run player gets
1 if he buys high quality, -1 if he buys low quality and 0 if no sale.
Consider a type $\omega$ of long-run player who gets 1 if he sells high quality,
2 if low and 0 if no sale. Clearly $\bar{v}_1(\omega,0) = 1.5$, for if $\omega$ mixes 1/2-1/2
between high and low, the short-run player is still willing to buy. If
$\mu(\omega) \ge .5$, however, then it is a Nash equilibrium for the long-run player to
play low quality, and for the short-run player to not buy, implying
$\underline{v}_1(\omega,0) = 0$ (which is the individually rational payoff). The long-run player
cannot build a reputation for producing high quality because the short-run
player never buys and so never observes the long-run player's action.

To rule out this possibility, we use the following condition: The game
is identified if for all $\alpha_2$ that are not weakly dominated,
$p(\cdot|\alpha_1,\alpha_2) = p(\cdot|\alpha_1',\alpha_2)$ implies $\alpha_1 = \alpha_1'$. This condition requires that
distinct actions of the long-run player induce distinct distributions over
outcomes. Clearly the game is identified if the long-run player's actions
are observed, so that, for every $y$ there is a unique $a_1$, independent of
$a_2$, such that $p(y|a_1,a_2) > 0$. This is the case if the stage game is a
one-shot simultaneous move game, and there is no moral hazard.

Even if there is moral hazard, the game may be identified. Let $R(\alpha_2)$
be the matrix with columns corresponding to outcomes $y$, rows corresponding
to actions $a_1$ for the long-run player, and entries $R_{a_1 y} = p(y|a_1,\alpha_2)$.
Since $p(\cdot|\alpha_1,\alpha_2) = \alpha_1 R(\alpha_2)$, if $R(\alpha_2)$ has full row rank for all
undominated $\alpha_2$, the game is identified. It might seem that if player 1 has no
more actions than there are outcomes, this condition is generically true in
simultaneous move games. However, this is somewhat deceptive: while it is
true that if the number of outcomes is at least the number of actions by the
long-run player $R(\alpha_2)$ will generically have full row rank for all pure
strategies, $R(\alpha_2)$ will generically have full row rank only for almost all
mixed strategies.
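The rank condition can be checked mechanically. The sketch below assumes a hypothetical two-action, two-outcome information structure: if player 2 plays H the outcome equals player 1's action, and if player 2 plays T the outcome is the opposite of it. The names `rank`, `R`, and `identified_at` are ours, and `q` denotes the probability that player 2 plays H.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a small matrix with exact entries, by Gaussian elimination."""
    rows = [list(r) for r in matrix]
    rk = 0
    for col in range(len(rows[0])):
        # Look for a nonzero pivot at or below the current row.
        pivot = next((i for i in range(rk, len(rows)) if rows[i][col] != 0),
                     None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            factor = rows[i][col] / rows[rk][col]
            rows[i] = [x - factor * y for x, y in zip(rows[i], rows[rk])]
        rk += 1
        if rk == len(rows):
            break
    return rk


def R(q):
    """Rows: player 1's actions (H, T). Columns: outcomes (H, T)."""
    return [[q, 1 - q],   # action H: outcome H iff player 2 plays H
            [1 - q, q]]   # action T: outcome H iff player 2 plays T


def identified_at(q):
    """True if R(q) has full row rank, so alpha_1 -> alpha_1 R(q) is injective."""
    m = R(q)
    return rank(m) == len(m)


for q in (Fraction(1), Fraction(3, 10), Fraction(1, 2)):
    print(q, identified_at(q))
```

In this assumed structure the determinant of `R(q)` is `2q - 1`, so full row rank fails only at the single mixed action `q = 1/2`, which illustrates the "almost all mixed strategies" qualification.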

As an example, consider a game in which both players have two actions,
heads H and tails T, and there are two outcomes, also called H and T.
If $a_2 = H$, $y = a_1$. If $a_2 = T$, then the outcome is the opposite of
whatever player 1 chose, that is, (H,T) produces outcome T, while (T,T)
produces outcome H. Thus $R(\alpha_2)$ has full rank except for $\alpha_2 = (\tfrac{1}{2}H, \tfrac{1}{2}T)$:
in this case each outcome has probability 1/2 regardless of how the long-run
player plays, so the game is not identified. Moreover, perturbing the
information structure slightly will not make the game identified.
Fortunately, many simultaneous-move economic games have information structures
satisfying natural monotonicity assumptions that rule out this type of
singularity.

A more interesting economic example is that of a repeated signalling
game. Here the long-run player (worker) draws a type ("high productivity,"
"low productivity") $\tau_1$ from a finite set each period in an i.i.d. manner.
He then makes a decision $d_1$ ("go to school", "do not go to school") based
on his type. An action $a_1$ is a map from types to decisions $d_1 = a_1(\tau_1)$.
The short-run player (firm) moves second and observes the decision of the
long-run player, but not his type. He then makes a decision $d_2$. An action
$a_2$ for the short-run player is a map from long-run player decisions to
short-run player decisions $d_2 = a_2(d_1)$. At the end of the period, after
the short-run player's decision is final, the current type of long-run
player $\tau_1$ is revealed to the current and all subsequent short-run players.
Clearly, the game is identified, since any mixture over maps from types to
decisions induces a unique distribution over pairs $(\tau_1,d_1)$, both observed
ex post. Our result below implies that in the repeated signalling game, the
worker can do as well as by committing to any map from type to schooling.

In contrast, if the long-run player moves after the short-run player
and has more than one information set, the game will typically fail to be
identified. Even if players observe the terminal node, so the short-run
player observes the way the long-run player played, this will not reveal the
stage-game strategy he chose. This is reflected in a non-generic $R(\alpha_2)$
matrix. In the game in Figure 1, if the short-run player plays "do not
buy," the only possible outcome is "no sale," and the corresponding 2x3
matrix $R(\alpha_2)$ has rank one.

On the other hand, we have assumed that the long-run player has many
types, and that his type is private information. It may be reasonable to
suppose that the same is true of the short-run players. If these types are
also chosen independently from period to period, and there is sufficient
variety of types, every sequence of moves of the short-run player will have
positive probability. In this case, if the terminal node is observed, it is
clear that the game is identified, although for some $\alpha_2$, $R(\alpha_2)$ may not
have full row rank, and indeed, may not even have more rows than columns.

Theorem 3.3: In a non-degenerate game that is identified, for any
stationary type $\omega_0$, $\underline{v}_1(\omega_0,0) = \bar{v}_1(\omega_0,0)$.

Proof: Recall that $B_0(\alpha_1)$ is the set of weakly undominated $\alpha_2$ such that
there exists an $\alpha_1'$ with $p(\cdot|\alpha_1',\alpha_2) = p(\cdot|\alpha_1,\alpha_2)$ and such that $\alpha_2$ is a
best response to $\alpha_1'$. Since the game is identified the only such $\alpha_1'$ is
$\alpha_1$, so $B_0(\alpha_1)$ is simply the undominated best responses to $\alpha_1$.

Therefore, it suffices to show that for $\alpha_2 \in B_0(\alpha_1)$ there exists a
sequence $\alpha_1^n \to \alpha_1$ such that $B_0(\alpha_1^n) = \{\alpha_2\}$. Now $\alpha_2$ is by definition
undominated and, by the hypothesis of non-degeneracy, does not yield the
same vector of payoffs to player 2 as any other mixed strategy. Thus there
exists an $\alpha_1'$ such that $\alpha_2$ is a strict best response to $\alpha_1'$. Then,
however, it is a strict best response to $\lambda\alpha_1 + (1-\lambda)\alpha_1'$ for all $0 < \lambda < 1$.
Let $0 < \lambda^n < 1$ with $\lambda^n \to 1$. Then $\alpha_1^n = \lambda^n\alpha_1 + (1-\lambda^n)\alpha_1' \to \alpha_1$ and $\alpha_2$
is the unique best response to $\alpha_1^n$. ∎

4.   Bayesian  Inference  and  Active  Supermartingales 

We now demonstrate Theorem 4.1, stated in the previous section, that it
is unlikely that forecasts of $y$ are wrong in many periods. We do so via
several lemmas analyzing the odds ratio $[1-\mu(\Omega^+|h_t)]/\mu(\Omega^+|h_t)$ for
arbitrary sets $\Omega^+$. If this odds ratio is low, $\Omega^+$ is likely to be true, so
conditional forecasts of $y$ under $\sigma^+$ are close to those under $\sigma$. On the
other hand, when conditional forecasts of $y$ are different under $\sigma^+$ than
under $\sigma$, the odds ratio has a good chance of falling substantially if the
true strategy is $\sigma^+$. Because the odds ratio is a supermartingale, it
converges almost surely. Moreover, it is well known that the odds ratio
sampled only at periods where the forecasts of $y$ are different under $\sigma$
and $\sigma^+$ is a supermartingale that converges to zero (see, for example,
Neveu [1975]). We strengthen this observation to show that for a fixed
difference between the conditional forecasts $p^+(h_{t-1})$ and $p(h_{t-1})$, the
odds ratio converges to zero at a uniform rate, independent of the
particular distributions $p^+$ and $p$.

We begin by defining families of scalar random variables
$(p_t^+(h), p_t^-(h))$. Set $p_t^+ = p_m^+(h_{t-1})$ and $p_t^- = p_m^-(h_{t-1})$ if $y(t)$ is the $m$th
element of $Y$. (Recall that $h_t \in H_t$ is the finite history that coincides
with $h$ through and including time $t$.) Define another family of random
variables $L_t(h)$ as follows:

$$L_0(h) = \frac{1-\mu(\Omega^+)}{\mu(\Omega^+)}, \qquad L_t(h) = \frac{p_t^-(h)}{p_t^+(h)}\,L_{t-1}(h).$$

It is well known that $L_t(h) = [1-\mu(\Omega^+|h_t)]/\mu(\Omega^+|h_t)$, which is the
posterior odds ratio at the end of period $t$ under $\sigma^+$ that player 1 is not in
$\Omega^+$. It is also well known that this odds ratio is a supermartingale under
the distribution $\sigma^+$. We give a proof for completeness.


Lemma 4.1: $L_t(h) = [1-\mu(\Omega^+|h_t)]/\mu(\Omega^+|h_t)$ and $(L_t,h_t)$ is a
supermartingale under $\sigma^+$.

Proof: The first claim is by definition true for $L_0$. Imagine it is true
for $L_{t-1}$; then

$$[1-\mu(\Omega^+|h_t)]/\mu(\Omega^+|h_t) = p_t^-\,[1-\mu(\Omega^+|h_{t-1})]/[p_t^+\,\mu(\Omega^+|h_{t-1})].$$

To see that $L_t(h)$ is a supermartingale, recall that $p^+(h_{t-1})$ is the
conditional probability over $y$ from $\sigma^+$ so that

$$E[L_t \mid L_{t-1},h_{t-1}] = L_{t-1}\sum_m p_m^+(h_{t-1})\,\frac{p_m^-(h_{t-1})}{p_m^+(h_{t-1})} = L_{t-1}\sum_m p_m^-(h_{t-1}) \le L_{t-1},$$

where the sums run over those $m$ with $p_m^+(h_{t-1}) > 0$. ∎
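The recursion for the odds ratio and the supermartingale inequality can be checked on a small example. This is an illustrative sketch only: the prior, the two outcome distributions, and the observed sequence are made-up numbers, and the distributions are held fixed over time for simplicity (in the text they depend on the history).

```python
from fractions import Fraction as F

# Hypothetical inputs, chosen only for illustration.
PRIOR = F(1, 4)                      # mu(Omega+)
P_PLUS = [F(1, 2), F(1, 2), F(0)]    # outcome distribution under Omega+
P_MINUS = [F(1, 4), F(1, 4), F(1, 2)]  # outcome distribution under Omega-
OUTCOMES = [0, 1, 0]                 # observed outcome indices, drawn under Omega+


def posterior(prior, p_plus, p_minus, outcomes):
    """Posterior probability of Omega+ via period-by-period Bayes updating."""
    mu = prior
    for m in outcomes:
        num = mu * p_plus[m]
        mu = num / (num + (1 - mu) * p_minus[m])
    return mu


def odds_ratio(prior, p_plus, p_minus, outcomes):
    """L_t from L_0 = (1 - mu)/mu and L_t = (p_t^- / p_t^+) L_{t-1}."""
    L = (1 - prior) / prior
    for m in outcomes:
        L *= p_minus[m] / p_plus[m]
    return L


def expected_next(L_prev, p_plus, p_minus):
    """E[L_t | L_{t-1}] when the outcome is drawn from p_plus."""
    return sum(p_plus[m] * (p_minus[m] / p_plus[m]) * L_prev
               for m in range(len(p_plus)) if p_plus[m] != 0)


mu_t = posterior(PRIOR, P_PLUS, P_MINUS, OUTCOMES)
L_t = odds_ratio(PRIOR, P_PLUS, P_MINUS, OUTCOMES)
print(L_t, (1 - mu_t) / mu_t)          # the two expressions coincide
print(expected_next(F(3), P_PLUS, P_MINUS))  # no larger than L_{t-1} = 3
```

The third outcome has probability zero under `P_PLUS`, so the sum in `expected_next` drops it; that is exactly why the conditional expectation can fall strictly below the previous value.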

Let $\Delta(h_{t-1}) = \|p^+(h_{t-1}) - p^-(h_{t-1})\|$ be the distance between the
distributions over outcomes corresponding to $\Omega^+$ and $\Omega^- = \Omega\setminus\Omega^+$. Note that
$\|p^+(h_{t-1}) - p(h_{t-1})\| \le \Delta(h_{t-1})$ since $p(h_{t-1})$ is a convex combination of
$p^+(h_{t-1})$ and $p^-(h_{t-1})$.

Next we show that the odds ratio $L_t$ is likely to fall substantially
when $\Delta(h_{t-1}) > \Delta_0$.

Lemma 4.2: If $h_{t-1}$ has positive probability and $\Delta(h_{t-1}) > \Delta_0$ then under
$\sigma^+$, $\Pr[(L_t(h)/L_{t-1}(h)) - 1 \le -\Delta_0/M] \ge \Delta_0/M$.

Proof: Note first that $L_t/L_{t-1} = p_t^-/p_t^+$, which is $p_1^-(h_{t-1})/p_1^+(h_{t-1})$ with
probability $p_1^+(h_{t-1})$; $p_2^-(h_{t-1})/p_2^+(h_{t-1})$ with probability $p_2^+(h_{t-1})$; and
so forth for those indices $m$ for which $p_m^+ \ne 0$. Consequently, it suffices
to show for some $m$,

$$p_m^-(h_{t-1})/p_m^+(h_{t-1}) \le 1 - \Delta_0/M \quad\text{and}\quad p_m^+(h_{t-1}) \ge \Delta_0/M.$$

By hypothesis, $\Delta(h_{t-1}) = \max_m |p_m^+(h_{t-1}) - p_m^-(h_{t-1})| \ge \Delta_0$. Suppose
without loss of generality that this maximum occurs at $m = 1$. If
$p_1^+(h_{t-1}) - p_1^-(h_{t-1}) \ge \Delta_0$ then $p_1^+ \ge \Delta_0$ and
$1 - p_1^-(h_{t-1})/p_1^+(h_{t-1}) \ge \Delta_0/p_1^+(h_{t-1}) \ge \Delta_0$, so we are done. If, on the other
hand, $p_1^-(h_{t-1}) - p_1^+(h_{t-1}) \ge \Delta_0$, then $\sum_{m>1} (p_m^+(h_{t-1}) - p_m^-(h_{t-1})) \ge \Delta_0$.
Consequently $M \max_{m>1} (p_m^+(h_{t-1}) - p_m^-(h_{t-1})) \ge \Delta_0$, and, for $m = 2$ say, we
have $p_2^+(h_{t-1}) - p_2^-(h_{t-1}) \ge \Delta_0/M$. Again, we conclude $p_2^+(h_{t-1}) \ge \Delta_0/M$ and
$1 - p_2^-(h_{t-1})/p_2^+(h_{t-1}) \ge \Delta_0/M$. ∎

Lemma 4.2 shows that in the periods where the conditional distribution
over outcomes induced by the actions of types in $\Omega^- = \Omega\setminus\Omega^+$ differs
significantly from that corresponding to $\Omega^+$, the likelihood ratio is likely to
jump down by a significant amount. Of course, in periods where $\Delta(h_{t-1})$ is
small, the likelihood ratio need not change very much. The key to our
result is to show that there is high probability that there are few periods
in which both the odds ratio is high and $\Delta(h_{t-1}) > \Delta_0$. To prove this, we
introduce a new supermartingale which includes all of the periods where
$\Delta(h_{t-1}) > \Delta_0$ from the supermartingale $L(h)$.

We first define a sequence of stopping times relative to a given
supermartingale $L = L(h)$ and a distance $\Delta_0$. Set $\tau_0 = 0$. If
$\tau_{k-1}(h,\Delta_0) = \infty$, set $\tau_k(h,\Delta_0) = \infty$ as well. If $\tau_{k-1}(h,\Delta_0)$ is finite, set
$\tau_k(h,\Delta_0)$ to be the first time $t > \tau_{k-1}(h,\Delta_0)$ such that either

(1) $\Pr[L_t/L_{t-1} - 1 \le -\Delta_0/M \mid h_{t-1}] \ge \Delta_0/M$, or

(2) $L_t/L_{\tau_{k-1}} - 1 > \Delta_0/2M$, or

(3) if no such time exists, set $\tau_k(h,\Delta_0) = \infty$.

Lemma 4.2 shows that this sequence of stopping times picks out at least all
the date-history pairs for which $\Delta(h_{t-1}) > \Delta_0$.

The faster process $\tilde{L}_k$ relative to $L$ and $\Delta_0$ is defined by
$\tilde{L}_k = L_{\tau_k}$ for $\tau_k < \infty$, and $\tilde{L}_k = 0$ for $\tau_k = \infty$. Since the $\tau_k$ are
stopping times, $\tilde{L}$ is a supermartingale, with an associated filtration
whose events we denote $h_k$. Moreover, we will show that $\tilde{L}$ is an "active"
supermartingale in the following sense:

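Definition: A positive supermartingale $(\tilde{L}_k,h_k)$ is an active
supermartingale with activity $\psi$ if

$$\Pr[\,|\tilde{L}_{k+1}/\tilde{L}_k - 1| > \psi \mid h_k\,] > \psi$$

for almost all histories $h_k$ such that $\tilde{L}_k > 0$.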

Lemma 4.3: For any $\Omega^+$ with $\mu(\Omega^+) > 0$, and any $\Delta_0 > 0$, the associated
faster process $\tilde{L}_k$ is an active supermartingale under $\sigma^+$ with activity
$\Delta_0/2M$.

Proof: Since the $\tau_k$ are stopping times, by Lemma 4.1 $\tilde{L}_k$ is a
supermartingale. Next, we claim that if $h_{k-1}$ is such that $\tilde{L}_{k-1} > 0$,

$$\Pr[\,|\tilde{L}_k/\tilde{L}_{k-1} - 1| > \Delta_0/2M \mid h_{k-1}\,] \ge \Delta_0/M.$$

To see this, let $s = \tau_{k-1}(h)$, which is a constant with respect to $h_{k-1}$;
$\tau_k(h)$ is a random variable. We will show that

$$\Pr[\,|L_{\tau_k}/L_s - 1| > \Delta_0/2M \mid h_s\,] \ge \Delta_0/M.$$

One of the three rules in the definition of the $\tau$'s must be used to choose
$\tau_k$. We will show that this inequality holds conditional on each rule, and
thus that it holds averaging over all of them. Conditional on $h_s$, if rule
(2) or (3) is used, then with probability one $|L_{\tau_k}/L_s - 1| > \Delta_0/2M$. If rule
(1) is used,

$$\Pr[\,L_{\tau_k}/L_{\tau_k-1} - 1 \le -\Delta_0/M \mid h_s, \{\text{rule 1 used}\}\,] \ge \Delta_0/M,$$

and also since rule (2) was not used at the date $\tau_k - 1$ just before $\tau_k$,

$$L_{\tau_k-1}/L_s - 1 \le \Delta_0/2M.$$

Combining the last two inequalities shows that

$$\Pr[\,L_{\tau_k}/L_s - 1 \le -(\Delta_0/2M + \Delta_0^2/2M^2) \mid h_s, \{\text{rule 1}\}\,] \ge \Delta_0/M.$$

Since $-(\Delta_0/2M + \Delta_0^2/2M^2) < -\Delta_0/2M$, we conclude that

$$\Pr[\,|L_{\tau_k}/L_s - 1| > \Delta_0/2M \mid h_s, \{\text{rule 1}\}\,] \ge \Delta_0/M.$$ ∎
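The step that combines the two inequalities under rule (1) is a one-line product. A sketch of the arithmetic, using only the bounds stated in the proof (the ratio over the last period is at most $1 - \Delta_0/M$ and the ratio accumulated since $s$ is at most $1 + \Delta_0/2M$):

```latex
\frac{L_{\tau_k}}{L_s}
  = \frac{L_{\tau_k}}{L_{\tau_k-1}} \cdot \frac{L_{\tau_k-1}}{L_s}
  \le \Bigl(1-\frac{\Delta_0}{M}\Bigr)\Bigl(1+\frac{\Delta_0}{2M}\Bigr)
  = 1 + \frac{\Delta_0}{2M} - \frac{\Delta_0}{M} - \frac{\Delta_0^2}{2M^2}
  = 1 - \frac{\Delta_0}{2M} - \frac{\Delta_0^2}{2M^2}
  < 1 - \frac{\Delta_0}{2M}.
```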

The  remainder  of  Theorem  4.1  follows  from  the  fact  that  active 
supermartingales  converge  to  zero  at  a  uniform  rate  that  depends  only  on 
their  initial  value  and  their  degree  of  activity. 

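Theorem A.1: Let $\ell_0 > 0$, $\epsilon > 0$, and $\psi \in (0,1)$ be given. For each $\underline{L}$,
$0 < \underline{L} < \ell_0$, there is a time $K < \infty$ such that

$$\Pr[\sup_{k>K} \tilde{L}_k \le \underline{L}] \ge 1 - \epsilon$$

for every active supermartingale $\tilde{L}$ with $\tilde{L}_0 = \ell_0$ and activity $\psi$.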

This theorem is proved in the Appendix using results about upcrossing
numbers. The key aspect of the Theorem is that the bound $K$ depends only
on $\ell_0$ and $\psi$, and is independent of the particular supermartingale
chosen.

We  can  now  conclude: 

Theorem 4.1: For every $\epsilon > 0$, $\Delta_0 > 0$ and $\Omega^+$ with $\mu(\Omega^+) > 0$ there is
a $K$ depending only on these three numbers such that for any $\sigma_1$ and $\sigma_2$,
under the probability distribution generated by $\sigma_1(\Omega^+)$, there is
probability less than $\epsilon$ that there are more than $K$ periods with

$$\|p^+(h_{t-1}) - p(h_{t-1})\| > \Delta_0.$$

Loosely speaking, if $\Omega^+$ is true, the short-run players forecast the
outcome $y$ about as well in almost every period as they would if they knew
that $\Omega^+$ was true. (This is loose because $p^+$ and $p$ depend on the
short-run player's action $\alpha_2$.)

This  section  uses  Theorem  4.1  to  characterize  the  long-run  player's 
equilibrium  payoffs.  To  begin,  we  define  what  it  means  for  the  short-run 
player's  action  to  be  a  best  response  to  approximately  correct  beliefs  about 
the  distribution  over  outcomes,  as  opposed  to  beliefs  about  the  long-run 
player's  action.   Because  we  will  assume  that  commitment  types  have  full 
support,  all  mixed  actions  by  player  1  have  positive  probability  in  any  Nash 
equilibrium.   Thus  no  player  2  will  ever  choose  a  weakly  dominated  strategy, 
and  we  exclude  these  strategies  in  our  definition  of  a  best  response. 

Proof: Set $\underline{L} = \Delta_0/(1-\Delta_0)$ and $L_0 = (1-\mu(\Omega^+))/\mu(\Omega^+)$. Since by Lemma 4.2
the faster process omits only observations when $\Delta(h_{t-1}) \le \Delta_0$, we conclude
from Lemma 4.3 and Theorem A.1 that there exists $K$, depending only on $L_0$
and $\Delta_0$, so that with probability $1-\epsilon$ under $\sigma^+$ in all but $K$ periods
either $\Delta(h_{t-1}) \le \Delta_0$ or $L_{t-1} \le \underline{L}$. By Lemma 4.1, in the latter case
$L_{t-1} = [1-\mu(\Omega^+|h_{t-1})]/\mu(\Omega^+|h_{t-1}) \le \Delta_0/(1-\Delta_0)$, so that either $\Delta(h_{t-1}) \le \Delta_0$
or $\mu(\Omega^+|h_{t-1}) \ge 1-\Delta_0$. Since $\|p^+(h_{t-1})-p(h_{t-1})\| \le \Delta(h_{t-1})$, the former
implies $\|p^+(h_{t-1})-p(h_{t-1})\| \le \Delta_0$, while the latter implies

$$\|p^+(h_{t-1})-p(h_{t-1})\| = \bigl\|p^+(h_{t-1}) - [\mu(\Omega^+|h_{t-1})\,p^+(h_{t-1}) + [1-\mu(\Omega^+|h_{t-1})]\,p^-(h_{t-1})]\bigr\|$$
$$\le [1-\mu(\Omega^+|h_{t-1})]\,\|p^+(h_{t-1}) - p^-(h_{t-1})\| \le 1 - \mu(\Omega^+|h_{t-1}) \le \Delta_0.$$ ∎

NOTES


1 Recall that the set of sequential equilibria is not robust to small
changes in the prior (Fudenberg, Kreps, and Levine [1988]).

2 An alternative interpretation is that the outcome $y$ is defined to
include the short-run player's payoff as well as a "signal".

3 Throughout the paper all of our measure spaces are topological spaces
endowed with the Borel $\sigma$-algebra.

4 An action $\alpha_2$ is weakly dominated if there exists $\alpha_2'$ such that
$v_2(\alpha_1,\alpha_2') \ge v_2(\alpha_1,\alpha_2)$ for all $\alpha_1 \in \mathcal{A}_1$, with strict inequality for at
least one $\alpha_1$.

5 We thank an anonymous referee for pointing this out to us.

APPENDIX: Active Supermartingales


Our  goal  is  to  prove 

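Theorem A.1: Let $\ell_0 > 0$, $\epsilon > 0$, and $\psi \in (0,1)$ be given. For each $\underline{L}$,
$0 < \underline{L} < \ell_0$, there is a time $K < \infty$ such that

$$\Pr[\sup_{k>K} L_k \le \underline{L}] \ge 1 - \epsilon$$

for every active supermartingale $L$ with $L_0 = \ell_0$ and activity $\psi$.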

For a given supermartingale the above is a simple consequence of the fact
that $L$ converges to zero with probability one. The force of the theorem
is to give a uniform bound on the rate of convergence for all
supermartingales with a given activity $\psi$ and initial value $\ell_0$.

Throughout  the  appendix  we  use  L  to  denote  any  supermartingale  that 
satisfies  the  hypotheses  of  Theorem  A.l.   To  prove  the  theorem,  we  will  use 
some  fundamental  results  from  the  theory  of  supermartingales,  in  particular, 
bounds  on  the  "upcrossing  numbers"  which  we  introduce  below.   These  results 
can  be  found  in  Neveu  [1975],  Ch.  II. 

Fact  A. 2:   For  any  positive  supermartingale,   Pr[sup  L,^cl  ^  niin(l,L0/c)  . 

Next,  fix  an  interval   [a,b] ,   0  <  a  <  b  <  «>,   and  define  U,  (a,b)   to 
be  the  number  of  upcrossings  of  [a,b]  up  to  time  k;   let  U  (a,b)  be  the 
total  number  of  upcrossings  (possibly  equal  to  «) . 

Fact  A. 3:   For  any  positive  supermartingale, 
Pr[Ua)(a,b)>N]  <,   (a/b)N  min(L0/a,l). 

This  is  known  as  Dubin's  inequality.   (See,  for  example,  Neveu  [1975],  p. 
27). 
Next we observe that since $L$ has activity $\psi$, it makes a jump of
size $\psi$ with probability at least $\psi$ in each period $k$ where $L_k$ is
nonzero. Consequently, over a large number of periods either $L$ has jumped
to zero or there are likely to be "many" jumps. Specifically, define $J_k$
to be the number of times $k' < k$ that $|(L_{k'+1}/L_{k'}) - 1| > \psi$.

Lemma A.4: For all $\epsilon$ and $J$ there exists a $K$ such that

$\Pr[\{J_K \ge J\} \text{ or } \{L_K = 0\}] \ge 1 - \epsilon$.

Proof: Because $L$ has activity $\psi$, in each period $k'$, either $L_{k'} = 0$
or the probability of a jump of size $\psi$ at time $k'$ exceeds $\psi$. Define a
sequence of indicator functions $I_k$ by $I_k = 1$ iff $\{L_k = 0$ or
$|L_k/L_{k-1} - 1| > \psi\}$, and set $S_k = \sum_{\kappa \le k} I_\kappa$. Each $I_k$ has expectation at
least $\psi$, so for some $K$ sufficiently large, $\Pr[S_K \ge J] \ge 1 - \epsilon$. Now if
$S_K \ge J$, then either $L_k = 0$ for some $k \le K$, in which case $L_K = 0$ as
well, or there have been at least $J$ jumps by time $K$. ∎
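The event in Lemma A.4 can be checked path by path. The sketch below counts the periods in which the relative change of the path exceeds the activity level, stopping once the path is absorbed at zero, and then tests whether the path has either made at least J jumps or hit zero. The function names and the list representation of a path are assumptions made for illustration.

```python
def count_jumps(path, psi):
    """Return (number of relative jumps larger than psi, whether path ends at 0)."""
    jumps = 0
    for prev, cur in zip(path, path[1:]):
        if prev == 0:
            break  # the path is absorbed at zero; no further jumps are counted
        if abs(cur / prev - 1) > psi:
            jumps += 1
    return jumps, path[-1] == 0


def lemma_event(path, psi, j_required):
    """True if the path made at least j_required jumps or ended at zero."""
    jumps, hit_zero = count_jumps(path, psi)
    return jumps >= j_required or hit_zero
```

For the path [1.0, 2.0, 2.1, 1.0, 0.0, 0.0] with activity 0.5, three of the changes exceed 50 percent in relative terms and the path ends at zero, so the event holds for any required number of jumps.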

We  have  now  established  that  most  paths  of  L 

(1) do not exceed $\bar{c}$ for $\bar{c}$ large (Fact A.2),

(2)  make  "few"  upcrossings  of  any  positive  interval   [a,b]   (Fact  A. 3),  and 

(3)  either  make  "lots  of  jumps"  or  hit  zero  (Lemma  A. 4). 

We will use these three conditions to show that for $K$ large, most paths
remain below $\underline{L}$ from $K$ on. To do so, we first argue that most paths will
pass below $\underline{c}$ by time $K$.

Divide the interval $[\underline{c}, \bar{c}]$ into $I$ equal subintervals with endpoints
$e_1 = \underline{c}, \ldots, e_{I+1} = \bar{c}$. Then define the events

$E_1$ if $\max_{k \le K} L_k > \bar{c}$;

$E_2$ if at least one of the intervals $[e_i, e_{i+1}]$ is upcrossed $N$ or more
times by time $K$;

$E_3$ if $J_K < J$ and $L_K > 0$;

$E_4$ if $\min_{k \le K} L_k \le \underline{c}$.

By judicious choice of $\bar{c}$, $I$, $K$, $N$ and $J$, we will ensure that
$E_4^c \subset E_1 \cup E_2 \cup E_3$ and that $\Pr(E_1), \Pr(E_2), \Pr(E_3) \le \epsilon/4$. This will yield
our preliminary conclusion that

$\Pr[\min_{k \le K} L_k \le \underline{c}] = \Pr(E_4) \ge 1 - (3\epsilon/4)$.
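If we then choose

$\underline{c} = (\epsilon/4)\underline{L}$,

Fact A.2 implies that

$\Pr[\max_{k>K} L_k > \underline{L} \mid \min_{k \le K} L_k \le \underline{c}] \le \underline{c}/\underline{L} = \epsilon/4$

and we get the desired conclusion that

$\Pr[\max_{k>K} L_k > \underline{L}]$
$= \Pr[\max_{k>K} L_k > \underline{L} \mid \min_{k \le K} L_k \le \underline{c}] \cdot \Pr[\min_{k \le K} L_k \le \underline{c}]$
$+ \Pr[\max_{k>K} L_k > \underline{L} \mid \min_{k \le K} L_k > \underline{c}] \cdot \Pr[\min_{k \le K} L_k > \underline{c}]$
$\le (\epsilon/4) \cdot 1 + 1 \cdot (3\epsilon/4) = \epsilon$.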

Turning first to $E_1$, we can again use Fact A.2 to choose

$\bar{c} = (4/\epsilon)\ell_0$

and ensure that $\Pr(E_1) = \Pr(\max_{k \le K} L_k > \bar{c}) \le \epsilon/4$. Note for future
reference that this is true regardless of how we pick $K$.

In the range above $\underline{c}$, when $|L_k/L_{k-1} - 1| > \psi$, $|L_k - L_{k-1}| > \psi\underline{c}$. Thus,
if we choose

$I \ge 2\bar{c}/(\psi\underline{c}) + 1$

then the width of each subinterval is less than $\psi\underline{c}/2$. This means that each
jump of relative size $\psi$ in a path that remains between $\underline{c}$ and $\bar{c}$ must
cross one of the subintervals $[e_i, e_{i+1}]$. Moreover, if such a path has $J$
or more jumps across subintervals, it must cross at least one subinterval
$(J - I)/2I - 1$ times. Consequently if we choose

(*) $N \le (J - I)/2I - 1$

then $E_4^c \subset E_1 \cup E_2 \cup E_3$ as required. In other words, a path that does not
go above $\bar{c}$, that does not upcross any subinterval in $[\underline{c}, \bar{c}]$ $N$ or more
times, and jumps $J$ or more times, must fall below $\underline{c}$. By Fact A.3, we
know that for any given subinterval, the probability of $N$ or more
upcrossings is not more than

$(1+\psi)^{-N} \ell_0/\underline{c}$.

Consequently, the probability that some subinterval is upcrossed $N$ or more
times is no more than

$I(1+\psi)^{-N} \ell_0/\underline{c}$.

To make $\Pr(E_2) \le \epsilon/4$ we should choose

$N \ge \dfrac{\log(4I\ell_0/(\underline{c}\epsilon))}{\log(1+\psi)}$.

This determines $J$ by (*) above:

$J = 2I(N+1) + I$.

Finally, choose $K$ by Lemma A.4 to make $\Pr(E_3) \le \epsilon/4$.
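The chain of choices in the proof can be traced numerically. The sketch below computes the upper level, the lower level, the number of subintervals, the upcrossing count, and the jump count from the initial value, the tolerance, the activity, and the target level. It assumes the garbled formulas read: upper level 4*l0/eps, lower level (eps/4)*L_low, I at least 2*c_bar/(psi*c_low) + 1, N at least log(4*I*l0/(c_low*eps))/log(1+psi), and J = 2*I*(N+1) + I. The function and variable names are hypothetical, and K is not computed because Lemma A.4 only asserts its existence.

```python
import math


def choose_constants(l0, eps, psi, L_low):
    """Constants used in the proof, under the reading stated in the lead-in."""
    c_bar = 4.0 * l0 / eps        # upper level, so that l0 / c_bar = eps / 4
    c_low = eps * L_low / 4.0     # lower level, so that c_low / L_low = eps / 4
    # Number of equal subintervals of [c_low, c_bar]; each is narrower than
    # psi * c_low / 2.
    I = math.ceil(2.0 * c_bar / (psi * c_low) + 1.0)
    # Upcrossing count making I * (1 + psi)^(-N) * l0 / c_low at most eps / 4.
    N = math.ceil(math.log(4.0 * I * l0 / (c_low * eps)) / math.log(1.0 + psi))
    # Jump count from condition (*), taken with equality.
    J = 2 * I * (N + 1) + I
    return {"c_bar": c_bar, "c_low": c_low, "I": I, "N": N, "J": J}
```

With l0 = 1, eps = 0.4, psi = 0.5 and L_low = 0.5, the two levels are approximately 10 and 0.05. The required number of jumps J grows quickly as eps or psi shrinks, which is why the bound on K is uniform but far from tight.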